Overview

Dataset statistics

Number of variables: 11
Number of observations: 999
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 86.0 KiB
Average record size in memory: 88.1 B

Variable types

Numeric: 11

Warnings

1 has unique values Unique
2 has unique values Unique
3 has unique values Unique
4 has unique values Unique
5 has unique values Unique
6 has unique values Unique
7 has unique values Unique
8 has unique values Unique
9 has unique values Unique
10 has unique values Unique

Reproduction

Analysis started: 2021-02-28 01:08:24.791121
Analysis finished: 2021-02-28 01:08:48.155494
Duration: 23.36 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

1
Real number (ℝ≥0)

UNIQUE

Distinct: 999
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4908268595
Minimum: 0.0001881734934
Maximum: 0.9995660062
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 0.0001881734934
5-th percentile: 0.04234476623
Q1: 0.2189356901
median: 0.5048459771
Q3: 0.7384930294
95-th percentile: 0.9483232349
Maximum: 0.9995660062
Range: 0.9993778327
Interquartile range (IQR): 0.5195573393

Descriptive statistics

Standard deviation: 0.2938751385
Coefficient of variation (CV): 0.5987348345
Kurtosis: -1.23314922
Mean: 0.4908268595
Median Absolute Deviation (MAD): 0.2618995951
Skewness: -0.0001171731307
Sum: 490.3360327
Variance: 0.08636259704
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.105743377         1      0.1%
0.428695081         1      0.1%
0.1628430905        1      0.1%
0.9755472071        1      0.1%
0.01367388573       1      0.1%
0.1355597293        1      0.1%
0.5709794809        1      0.1%
0.5086067189        1      0.1%
0.9995660062        1      0.1%
0.08090238716       1      0.1%
Other values (989)  989    99.0%

Value               Count  Frequency (%)
0.0001881734934     1      0.1%
0.000237680506      1      0.1%
0.0003354237415     1      0.1%
0.0004495745525     1      0.1%
0.002274518134      1      0.1%

Value               Count  Frequency (%)
0.9995660062        1      0.1%
0.9990050404        1      0.1%
0.9982276598        1      0.1%
0.99572321          1      0.1%
0.9946827416        1      0.1%

2
Real number (ℝ≥0)

UNIQUE

Distinct: 999
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4993099365
Minimum: 8.508865722 × 10⁻⁵
Maximum: 0.9999592176
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 8.508865722 × 10⁻⁵
5-th percentile: 0.05429990666
Q1: 0.2620030111
median: 0.5005389699
Q3: 0.7338527376
95-th percentile: 0.9502933578
Maximum: 0.9999592176
Range: 0.999874129
Interquartile range (IQR): 0.4718497265

Descriptive statistics

Standard deviation: 0.2785678267
Coefficient of variation (CV): 0.557905634
Kurtosis: -1.099712965
Mean: 0.4993099365
Median Absolute Deviation (MAD): 0.2366037248
Skewness: -0.002575696342
Sum: 498.8106266
Variance: 0.07760003406
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.720942247         1      0.1%
0.7782127406        1      0.1%
0.04580565426       1      0.1%
0.5302980454        1      0.1%
0.1851016411        1      0.1%
0.6727956624        1      0.1%
0.6991241009        1      0.1%
0.9485973325        1      0.1%
0.2460286785        1      0.1%
0.4446769466        1      0.1%
Other values (989)  989    99.0%

Value               Count  Frequency (%)
8.508865722 × 10⁻⁵  1      0.1%
0.00192071544       1      0.1%
0.004510525847      1      0.1%
0.007382266456      1      0.1%
0.008615878178      1      0.1%

Value               Count  Frequency (%)
0.9999592176        1      0.1%
0.9996616519        1      0.1%
0.9984938819        1      0.1%
0.9968087643        1      0.1%
0.9963340503        1      0.1%

3
Real number (ℝ≥0)

UNIQUE

Distinct: 999
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.496888229
Minimum: 0.003109625774
Maximum: 0.9989182351
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 0.003109625774
5-th percentile: 0.04575146399
Q1: 0.2317364716
median: 0.4845410262
Q3: 0.7651983585
95-th percentile: 0.9440981801
Maximum: 0.9989182351
Range: 0.9958086093
Interquartile range (IQR): 0.5334618868

Descriptive statistics

Standard deviation: 0.2978736103
Coefficient of variation (CV): 0.5994780977
Kurtosis: -1.315259159
Mean: 0.496888229
Median Absolute Deviation (MAD): 0.268836021
Skewness: 0.01546682282
Sum: 496.3913407
Variance: 0.08872868769
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.6242063818        1      0.1%
0.2386908291        1      0.1%
0.8612000267        1      0.1%
0.5671649352        1      0.1%
0.9413605537        1      0.1%
0.8323234227        1      0.1%
0.2023812085        1      0.1%
0.5455144818        1      0.1%
0.3855455557        1      0.1%
0.9905151757        1      0.1%
Other values (989)  989    99.0%

Value               Count  Frequency (%)
0.003109625774      1      0.1%
0.003444103524      1      0.1%
0.006440791534      1      0.1%
0.007464393741      1      0.1%
0.007623463403      1      0.1%

Value               Count  Frequency (%)
0.9989182351        1      0.1%
0.998192295         1      0.1%
0.9941519594        1      0.1%
0.9936207957        1      0.1%
0.992277249         1      0.1%

4
Real number (ℝ≥0)

UNIQUE

Distinct: 999
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5032280121
Minimum: 0.001223057276
Maximum: 0.9987846948
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 0.001223057276
5-th percentile: 0.05101582825
Q1: 0.2688959518
median: 0.5037115691
Q3: 0.7478565234
95-th percentile: 0.9480387195
Maximum: 0.9987846948
Range: 0.9975616375
Interquartile range (IQR): 0.4789605716

Descriptive statistics

Standard deviation: 0.2842376007
Coefficient of variation (CV): 0.5648286539
Kurtosis: -1.153262938
Mean: 0.5032280121
Median Absolute Deviation (MAD): 0.238542805
Skewness: -0.005868875243
Sum: 502.7247841
Variance: 0.08079101364
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.0586445285        1      0.1%
0.1623680848        1      0.1%
0.8796856997        1      0.1%
0.04466959555       1      0.1%
0.8480451717        1      0.1%
0.2913927825        1      0.1%
0.2731233467        1      0.1%
0.2005065242        1      0.1%
0.333889155         1      0.1%
0.7187050316        1      0.1%
Other values (989)  989    99.0%

Value               Count  Frequency (%)
0.001223057276      1      0.1%
0.002138953656      1      0.1%
0.003690861864      1      0.1%
0.004139408935      1      0.1%
0.004510888597      1      0.1%

Value               Count  Frequency (%)
0.9987846948        1      0.1%
0.998757835         1      0.1%
0.9980449851        1      0.1%
0.9977300144        1      0.1%
0.9976799681        1      0.1%

5
Real number (ℝ≥0)

UNIQUE

Distinct: 999
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5090419508
Minimum: 0.005313863512
Maximum: 0.9984778729
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 0.005313863512
5-th percentile: 0.05816797046
Q1: 0.2745586436
median: 0.5092126022
Q3: 0.7426676325
95-th percentile: 0.9524894411
Maximum: 0.9984778729
Range: 0.9931640094
Interquartile range (IQR): 0.4681089889

Descriptive statistics

Standard deviation: 0.2834070174
Coefficient of variation (CV): 0.5567458967
Kurtosis: -1.135675315
Mean: 0.5090419508
Median Absolute Deviation (MAD): 0.2344735903
Skewness: -0.02436773506
Sum: 508.5329089
Variance: 0.0803195375
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.9393033527        1      0.1%
0.7843577049        1      0.1%
0.8388703773        1      0.1%
0.4401231329        1      0.1%
0.03602155135       1      0.1%
0.1457042042        1      0.1%
0.4153624848        1      0.1%
0.7637684937        1      0.1%
0.5910626517        1      0.1%
0.494488633         1      0.1%
Other values (989)  989    99.0%

Value               Count  Frequency (%)
0.005313863512      1      0.1%
0.005649447907      1      0.1%
0.006496301852      1      0.1%
0.006668706425      1      0.1%
0.006951165851      1      0.1%

Value               Count  Frequency (%)
0.9984778729        1      0.1%
0.9981989313        1      0.1%
0.9979029929        1      0.1%
0.9973382638        1      0.1%
0.9964702826        1      0.1%

6
Real number (ℝ≥0)

UNIQUE

Distinct: 999
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5135112564
Minimum: 0.0005168188363
Maximum: 0.9990590557
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 0.0005168188363
5-th percentile: 0.04917674628
Q1: 0.2554089812
median: 0.5274046387
Q3: 0.7671528421
95-th percentile: 0.9518787304
Maximum: 0.9990590557
Range: 0.9985422369
Interquartile range (IQR): 0.5117438609

Descriptive statistics

Standard deviation: 0.2912013869
Coefficient of variation (CV): 0.5670788775
Kurtosis: -1.213893633
Mean: 0.5135112564
Median Absolute Deviation (MAD): 0.2526072282
Skewness: -0.08903805937
Sum: 512.9977451
Variance: 0.08479824772
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.6861720448        1      0.1%
0.5262651257        1      0.1%
0.2016186481        1      0.1%
0.7217757106        1      0.1%
0.3194111846        1      0.1%
0.0005168188363     1      0.1%
0.1682764494        1      0.1%
0.1208096535        1      0.1%
0.3974604504        1      0.1%
0.6016097893        1      0.1%
Other values (989)  989    99.0%

Value               Count  Frequency (%)
0.0005168188363     1      0.1%
0.001039078226      1      0.1%
0.001380243804      1      0.1%
0.001882047392      1      0.1%
0.003867906053      1      0.1%

Value               Count  Frequency (%)
0.9990590557        1      0.1%
0.9976156505        1      0.1%
0.9906958311        1      0.1%
0.9894722968        1      0.1%
0.9892231291        1      0.1%

7
Real number (ℝ≥0)

UNIQUE

Distinct: 999
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4870195881
Minimum: 0.0004930354189
Maximum: 0.999179194
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 0.0004930354189
5-th percentile: 0.03986519955
Q1: 0.2376704538
median: 0.4745734276
Q3: 0.7390013927
95-th percentile: 0.9330030491
Maximum: 0.999179194
Range: 0.9986861586
Interquartile range (IQR): 0.5013309389

Descriptive statistics

Standard deviation: 0.28853689
Coefficient of variation (CV): 0.5924543839
Kurtosis: -1.200191796
Mean: 0.4870195881
Median Absolute Deviation (MAD): 0.2510147719
Skewness: 0.05128721269
Sum: 486.5325685
Variance: 0.08325353689
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.4597294172        1      0.1%
0.3615989171        1      0.1%
0.1761468996        1      0.1%
0.8421361987        1      0.1%
0.7591143043        1      0.1%
0.9277444896        1      0.1%
0.2798478177        1      0.1%
0.1851067259        1      0.1%
0.9695012588        1      0.1%
0.971387635         1      0.1%
Other values (989)  989    99.0%

Value               Count  Frequency (%)
0.0004930354189     1      0.1%
0.001437212573      1      0.1%
0.002168755047      1      0.1%
0.002275173785      1      0.1%
0.003662644187      1      0.1%

Value               Count  Frequency (%)
0.999179194         1      0.1%
0.9979108833        1      0.1%
0.9968904082        1      0.1%
0.9968666697        1      0.1%
0.9953517392        1      0.1%

8
Real number (ℝ≥0)

UNIQUE

Distinct: 999
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4989466279
Minimum: 0.0004075991455
Maximum: 0.9986462623
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 0.0004075991455
5-th percentile: 0.04964895954
Q1: 0.2560673617
median: 0.5018658189
Q3: 0.737044692
95-th percentile: 0.9451366704
Maximum: 0.9986462623
Range: 0.9982386632
Interquartile range (IQR): 0.4809773304

Descriptive statistics

Standard deviation: 0.2860873305
Coefficient of variation (CV): 0.5733826316
Kurtosis: -1.201680741
Mean: 0.4989466279
Median Absolute Deviation (MAD): 0.2406459968
Skewness: 0.008081751893
Sum: 498.4476812
Variance: 0.08184596069
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.4361310576        1      0.1%
0.7217223453        1      0.1%
0.4221312671        1      0.1%
0.1300947089        1      0.1%
0.03977373661       1      0.1%
0.7743801677        1      0.1%
0.494031891         1      0.1%
0.2721951192        1      0.1%
0.08355963114       1      0.1%
0.007329605985      1      0.1%
Other values (989)  989    99.0%

Value               Count  Frequency (%)
0.0004075991455     1      0.1%
0.004566833377      1      0.1%
0.004819793627      1      0.1%
0.0050164524        1      0.1%
0.005249097012      1      0.1%

Value               Count  Frequency (%)
0.9986462623        1      0.1%
0.9984462825        1      0.1%
0.9977870975        1      0.1%
0.9974061728        1      0.1%
0.9963019455        1      0.1%

9
Real number (ℝ≥0)

UNIQUE

Distinct: 999
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5051851284
Minimum: 0.007883363403
Maximum: 0.9988844888
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 0.007883363403
5-th percentile: 0.06057471244
Q1: 0.2621554557
median: 0.518658031
Q3: 0.7482194676
95-th percentile: 0.9436818145
Maximum: 0.9988844888
Range: 0.9910011254
Interquartile range (IQR): 0.4860640119

Descriptive statistics

Standard deviation: 0.2829511131
Coefficient of variation (CV): 0.5600939086
Kurtosis: -1.189992985
Mean: 0.5051851284
Median Absolute Deviation (MAD): 0.2438255055
Skewness: -0.06801291502
Sum: 504.6799433
Variance: 0.08006133242
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.7365801076        1      0.1%
0.4075944712        1      0.1%
0.2351074263        1      0.1%
0.2276642777        1      0.1%
0.162213553         1      0.1%
0.632508945         1      0.1%
0.8747975631        1      0.1%
0.4338729656        1      0.1%
0.7484585829        1      0.1%
0.4928146098        1      0.1%
Other values (989)  989    99.0%

Value               Count  Frequency (%)
0.007883363403      1      0.1%
0.008486375213      1      0.1%
0.009443647461      1      0.1%
0.009553446202      1      0.1%
0.01177267847       1      0.1%

Value               Count  Frequency (%)
0.9988844888        1      0.1%
0.9973649841        1      0.1%
0.9971172756        1      0.1%
0.9952033276        1      0.1%
0.9942655736        1      0.1%

10
Real number (ℝ≥0)

UNIQUE

Distinct: 999
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4991376114
Minimum: 0.0001695572864
Maximum: 0.9988191046
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 0.0001695572864
5-th percentile: 0.05317098475
Q1: 0.2495816054
median: 0.5095781707
Q3: 0.7423927027
95-th percentile: 0.9408351386
Maximum: 0.9988191046
Range: 0.9986495473
Interquartile range (IQR): 0.4928110973

Descriptive statistics

Standard deviation: 0.2842145647
Coefficient of variation (CV): 0.569411237
Kurtosis: -1.188904605
Mean: 0.4991376114
Median Absolute Deviation (MAD): 0.2444973388
Skewness: 0.003290464751
Sum: 498.6384738
Variance: 0.0807779188
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0.7440459505        1      0.1%
0.8326091554        1      0.1%
0.2388933636        1      0.1%
0.9692958663        1      0.1%
0.5084897715        1      0.1%
0.9019867713        1      0.1%
0.8637597684        1      0.1%
0.6543217441        1      0.1%
0.8852269454        1      0.1%
0.6951389941        1      0.1%
Other values (989)  989    99.0%

Value               Count  Frequency (%)
0.0001695572864     1      0.1%
0.002183816861      1      0.1%
0.003050532658      1      0.1%
0.003460424254      1      0.1%
0.005036565475      1      0.1%

Value               Count  Frequency (%)
0.9988191046        1      0.1%
0.9983596164        1      0.1%
0.9979315691        1      0.1%
0.9960331533        1      0.1%
0.9957104295        1      0.1%

F
Real number (ℝ≥0)

Distinct: 27
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.01401401
Minimum: 3
Maximum: 29
Zeros: 0
Zeros (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 3
5-th percentile: 7
Q1: 12
median: 15
Q3: 18
95-th percentile: 23
Maximum: 29
Range: 26
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.773612737
Coefficient of variation (CV): 0.3179438045
Kurtosis: -0.3636394757
Mean: 15.01401401
Median Absolute Deviation (MAD): 3
Skewness: -0.006369707158
Sum: 14999
Variance: 22.78737856
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)
Value               Count  Frequency (%)
16                  89     8.9%
14                  88     8.8%
17                  84     8.4%
18                  70     7.0%
15                  70     7.0%
13                  65     6.5%
12                  62     6.2%
19                  55     5.5%
11                  53     5.3%
9                   53     5.3%
Other values (17)   310    31.0%

Value               Count  Frequency (%)
3                   3      0.3%
4                   6      0.6%
5                   14     1.4%
6                   16     1.6%
7                   19     1.9%

Value               Count  Frequency (%)
29                  1      0.1%
28                  1      0.1%
27                  3      0.3%
26                  5      0.5%
25                  13     1.3%
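
The summary figures reported for F are related by simple arithmetic (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, range = maximum - minimum, IQR = Q3 - Q1). The sketch below re-checks those relations using only the numbers printed in this report; the variable names, the `is_close` helper and the tolerance are illustrative choices, not part of pandas-profiling.

```python
import math

# Values copied from the statistics reported for variable F.
n = 999                      # number of observations
total = 14999                # Sum
mean = 15.01401401           # Mean
std = 4.773612737            # Standard deviation
variance = 22.78737856       # Variance
cv = 0.3179438045            # Coefficient of variation (CV)
minimum, maximum, value_range = 3, 29, 26
q1, q3, iqr = 12, 18, 6


def is_close(a, b, rel_tol=1e-6):
    """Compare two reported figures, allowing for display rounding."""
    return math.isclose(a, b, rel_tol=rel_tol)


checks = {
    "mean == sum / n": is_close(total / n, mean),
    "variance == std ** 2": is_close(std ** 2, variance),
    "cv == std / mean": is_close(std / mean, cv),
    "range == max - min": maximum - minimum == value_range,
    "iqr == q3 - q1": q3 - q1 == iqr,
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```

The same relations can be checked for any of the other variables by substituting their reported values.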

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that, for an exactly linear relationship, the steepness of the line does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
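
As a minimal sketch of that calculation, assuming two equal-length lists of numbers: the function name `pearson_r` and the toy data are illustrative and are not taken from the report or from pandas-profiling. The normalising factors in the covariance and the standard deviations cancel, so plain sums of squared deviations are used.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of (co)deviations; the common 1/(n-1) factor cancels in the ratio.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical toy data: y is an exact increasing linear function of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
print(pearson_r(x, y))

# Shifting and rescaling x (by a positive factor) leaves r unchanged.
x_shifted = [10.0 * a + 7.0 for a in x]
print(pearson_r(x_shifted, y))
```

Both calls print a value equal to 1 up to floating-point rounding, which illustrates the location and scale invariance described above.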

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
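
A minimal sketch of that procedure: rank both variables, then apply Pearson's formula to the ranks. The helper names `average_ranks` and `spearman_rho` are illustrative, and assigning tied values the average of their ranks is an assumption made here (a common convention), not something the report states.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    ss_x = sum((a - mean_x) ** 2 for a in rx)
    ss_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical toy data: y grows monotonically but not linearly with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [math.exp(v) for v in x]
print(spearman_rho(x, y))
```

Because the relationship in the toy data is strictly increasing, the ranks of x and y coincide and ρ equals 1 up to rounding, even though the relationship is not linear.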

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
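
The pair-counting definition can be sketched directly as follows. This is the simple form described in the text (often called tau-a), which makes no correction for ties; statistical libraries commonly apply a tie-corrected variant, so results may differ when ties are present. The function name `kendall_tau_a` and the toy data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts for neither side.
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical toy data: one pair out of three is discordant.
x = [1, 2, 3]
y = [1, 3, 2]
print(kendall_tau_a(x, y))
```

With the toy data, two of the three pairs are concordant and one is discordant, so τ = (2 - 1) / 3. The double loop visits every pair and is therefore quadratic in the number of observations, which is fine for a sketch but slow for large datasets.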

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

     1         2         3         4         5         6         7         8         9         10        F
0    0.968379  0.716820  0.678846  0.510049  0.545299  0.573692  0.817491  0.206636  0.880111  0.574187  17
1    0.468263  0.405603  0.320325  0.468084  0.100995  0.974407  0.300062  0.608356  0.050082  0.762011  12
2    0.776820  0.103740  0.373748  0.797782  0.906810  0.599069  0.626680  0.924705  0.632768  0.512663  16
3    0.407886  0.415357  0.101543  0.935160  0.444124  0.193607  0.656837  0.599929  0.868582  0.405406  20
4    0.538797  0.088621  0.886119  0.781848  0.401623  0.207652  0.946310  0.769453  0.827292  0.937183  15
5    0.206888  0.948597  0.558884  0.871703  0.465946  0.276475  0.989965  0.704521  0.274629  0.632537  17
6    0.187104  0.958926  0.333785  0.553103  0.321272  0.093246  0.992068  0.683878  0.826543  0.339985  14
7    0.779970  0.733282  0.640320  0.988489  0.503198  0.283350  0.412005  0.655302  0.025155  0.120866  23
8    0.193944  0.810933  0.444994  0.330572  0.518216  0.845187  0.525841  0.713356  0.666183  0.856873  11
9    0.434231  0.771914  0.465847  0.371157  0.895335  0.619349  0.597694  0.793747  0.610631  0.363718  17

Last rows

     1         2         3         4         5         6         7         8         9         10        F
989  0.675230  0.104578  0.089280  0.913877  0.212572  0.789406  0.715345  0.250423  0.778238  0.392814  16
990  0.466981  0.725928  0.925422  0.283279  0.543374  0.298245  0.012202  0.956630  0.751306  0.801752  18
991  0.716254  0.671017  0.047022  0.884141  0.046927  0.683850  0.964038  0.037830  0.835210  0.150489  24
992  0.130515  0.534216  0.123069  0.625352  0.145912  0.855714  0.491531  0.805196  0.690737  0.317576  12
993  0.589512  0.294239  0.854585  0.223456  0.067046  0.128168  0.861934  0.293647  0.702813  0.670513  11
994  0.162843  0.435990  0.149917  0.665505  0.914584  0.630336  0.384881  0.407224  0.328757  0.994527  16
995  0.907179  0.737143  0.830932  0.799533  0.214602  0.800022  0.504500  0.494688  0.462960  0.276626  20
996  0.505830  0.163747  0.146821  0.667286  0.573377  0.585563  0.934153  0.312294  0.854859  0.607592  15
997  0.296363  0.720200  0.434514  0.755218  0.613471  0.940339  0.329413  0.763345  0.018250  0.474626  17
998  0.181283  0.966139  0.502668  0.985803  0.285397  0.963976  0.283293  0.879572  0.139573  0.523076  17